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We investigate the performance of the hybrid Monte Carlo algorithm in updating non-trivial global topological 
structures. We find that the hybrid Monte Carlo algorithm has serious problems decorrelating the global topolog- 
ical charge. This represents a warning which must be seriously considered when simulating full QCD, regardless 
of the number and type of fermions, with this or any similar algorithm. Simulated tempering is examined as a 
means of accelerating the decorrelation. 



1. Introduction 

All production runs in full QCD use molecular 
dynamics based algorithms. For integer multiples 
of four staggered or two Wilson fermions one has 
the exact hybrid Monte Carlo (HMC), while for 
other multiples there is the very similar hybrid 
molecular dynamics algorithm. 

We examine here how effective the HMC is in 
decorrelating the topology for both full QCD with 
four staggered fermions and pure SU(3) gauge 
theory The latter we compare with an over- 
relaxed heat-bath algorithm. Although the only 
HMC algorithm with four staggered fermions has 
been investigated, the conclusions drawn are valid 
for any algorithm based on molecular dynamics. 

An ensemble containing the correct topology^ 
is crucial for the rj mass, for example. It can also 
become relevant for other quantities, for example 
the proton mass, at some level of accuracy. One 
criterion that the ensemble must fulfill is that the 
global topological charge averages to zero; we find 
that this is not so for a long run at our smallest 
quark mass. 

Other simulations involving the topology in the 
presence of fermions have been presented in the 
literature jl]-[| , where the existence of long range 
correlations have been reported. 

We have a large set of configurations generated 
with the HMC, originally prepared with the aim 
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of studying the spin of the proton Q . As the op- 
erator involves the topology, the longest topolog- 
ical auto-correlation length determines the num- 
ber of independent configurations. This is the 
auto-correlation of the global topological charge 
on cooled Jj| configurations. 

The conclusions drawn here require only that 
our method sees the most slowly updated modes 
associated with the 'topological' content of the 
lattice configurations. For this purpose any rea- 
sonable definition of the global topological charge 
that is not affected by noise is adequate; we em- 
ployed the field theoretic definition of Q, mea- 
sured on cooled configurations. 

2. Results 

We use the $ algorithm of reference ||, with 
a 16 3 x 24 lattice at a coupling (3 — 5.35 for full 
QCD, and a 16 3 x 16 lattice at (3 = 6.00 for pure 
SU(3). The two relevant units for comparing the 
auto-correlations are then fictitious molecular dy- 
namics time in units of r, and wall clock time. 

Let us begin with the results for quark mass 
m = 0.01. This yields a pion to rho mass ratio of 
m w /m p ~ 0.5 [Q, so we are still quite far from the 
physical value. The lattice spacing is a ~ 0.14fm 
(from m p ), which gives a lattice volume of V3 ~ 
(2.2fm) 3 . 

We performed a rather long simulation, with 
a total length after thermalisation of r ~ 500 in 
units of fictitious time. Further details may be 
found in || . Fig. |l| shows the time history of the 
topological charge, where it is clear that the HMC 
is unable to change the global topological modes 
efficiently, leading to very long auto-correlations. 
The value of the topological charge got stuck at 
around Q ~ —2, and its value averaged over all 
configurations generated so far is decidedly non- 
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Table 1 

Results from, and parameters used for the Hybrid 
Monte Carlo (HMC) runs, and, in the last row, 
the heat-bath over-relaxed (HBOR) run. The full 
QCD runs with four staggered fermions were per- 
formed on a 16 3 x 24 lattice, the pure SU(3) runs 
on a 16 3 x 16 lattice. The length of each tra- 
jectory in fictitious time is T tra j. The last three 
columns give the integrated auto-correlation time 
of Q and the plaquette II in units of fictitious time 
and wall clock hours on the 25 Gflops Quadrics. 



Figure 1. Time history, in units of molecular 
dynamics time r, of the topological charge Q for 
the HMC simulation at (3 — 5.35 and m = 0.01 
on a 16 3 x 24 lattice. 



zero: (Q) = —1.7(4). A very rough estimate of 
the integrated auto-correlation time Tq from a 
blocking analysis of the data gives Tq > 2 x 10 2 
in units of molecular dynamics time. The sim- 
ulations were performed on the 25 Gflops APE 
tower, with around 50% efficiency. On this ma- 
chine Tq ~ 200 corresponds to about three days 
of computer time, a considerable amount. Notice 
that most simulations at comparable values of (3 
and m presented in the literature have r ~ 100. 

To further investigate the behaviour of the 
HMC, we have performed simulations with larger 
quark masses, m = 0.035 and m = 0.05, and 
in the quenched case, which represents the large 
quark mass limit of full QCD. Quenched simula- 
tions were performed at (3 = 6.0, and here HMC 
has been compared with the over-relaxed heat- 
bath. One cycle of this algorithm consists of 5 mi- 
crocanonical updates followed by a pseudo-heat- 
bath update. 

In Table | we give the parameters used in our 
simulations and the estimates of the integrated 
auto-correlation time of the topological charge 
Tq . The estimates should be regarded as a lower 
limit for m — 0.01, and with a possible uncer- 
tainty of 10-20% for all other values. 

There are three major results that have 
emerged from this work: 

1. In HMC simulations the integrated auto- 
correlation time of the global topological 
charge decreases as the length of the trajec- 
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tory increases. If expressed in terms of com- 
puter time, then trajectories longer than 
one unit in molecular dynamics time seem 
to be slower. 

2. For pure gauge theory at (3 = 6.0 the HMC 
algorithm is about two orders of magni- 
tude slower in decorrelating Q than the lo- 
cal over-relaxed algorithm. For our 'best' 
set of HMC parameters we find that the 
plaquette decorrelates about 30 times more 
slowly, which agrees with the results of |J. 
The plaquette auto-correlation time in units 
of t is constant, though. However, Q 
decorrelates 60 times slower with the HMC 
than with the heat-bath over-relaxed for our 
'best' parameters. Furthermore the auto- 
correlation of Q is far more sensitive to the 
choice of parameters than the plaquette. 

3. The auto-correlation time Tq rapidly in- 
creases with decreasing quark mass, in 
terms of both CPU time and molecular dy- 
namics time. The auto-correlation time 
for the plaquette, on the other hand, re- 
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mains similar in units of r. The time re- 
quired appears to increase faster than the 
T Q w 1/m 2 - 5 expected §-[yJ if one uses 
1 / m 2 = 1/m as the relevant physical quan- 
tity, a factor 1/m from the matrix inversion, 
and another « 0.5 from the change in step 
size and acceptance rate. Note that in our 
rather long HMC simulation at m = 0.01 
(r ~ 500) the ensemble is not yet sufficient 
to determine the auto-correlation at all well; 
we cannot even estimate how long the run 
should have been to get (Q) w 0. 

3. Improvements 

It may also be possible to improve the per- 
formance of the HMC algorithm at small masses 
via the use of simulated tempering [O. In sim- 
ulated tempering, usually the coupling becomes 
a dynamical parameter in the simulation. We 
are currently investigating promoting the quark 
mass to a dynamical variable in the simulation. 
This helps in decorrelating the topological charge 
both because there are more fluctuations when 
the mass is large, and because the HMC algo- 
rithm itself is faster for larger masses. 

In simulated tempering one has an ordered set 
of masses, chosen such that the action histogram 
of adjacent masses has some overlap. The mass 
can then change to an adjacent mass if the config- 
uration lies in this overlapping region, ie., if the 
configuration is a representative member of the 
probability distribution for each of the masses. 

We are currently investigating simulations us- 
ing masses between 0.01 and 0.035. From these 
preliminary investigations we estimate that the 
improvements will more than compensate the ad- 
ditional overhead if the masses used range be- 
tween m < 0.01 and 0.02. 

Finally, we mention that algorithms that are 
not based on equations of motion, (for example, 
Liischer's p"3[ multi-boson algorithm), may per- 
form better with respect to the topological modes 
than the HMC algorithm. This is also being ex- 
amined. 

4. Conclusions 

In conclusion we have shown that the HMC al- 
gorithm has serious problems decorrelating global 
topogical modes, more serious than those associ- 
ated with commonly studied quantities. For full 
QCD the algorithm appears to slow down much 
more rapidly than previously expected. We stress 



again that this warning must be seriously consid- 
ered when simulating full QCD. It is especially 
important when studying quantities related to the 
topology, but may be less relevant in the calcula- 
tion of the mass spectrum (with the exception of, 
for example, the rf mass), since an effective de- 
coupling from the topological modes is expected. 
Note also that this slowing down is independent of 
both the number of fermions and type of fermion 
simulated, as the underlying dynamics of the al- 
gorithm are the same as the cases studied here. 
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